
    Digital Object Cloud for linking natural science collections information: The case of DiSSCo

    DiSSCo (the Distributed System of Scientific Collections) is a Research Infrastructure (RI) aiming to provide unified physical (transnational), remote (loans) and virtual (digital) access to the approximately 1.5 billion biological and geological specimens held in collections across Europe. DiSSCo represents the largest formal agreement ever made between natural science museums (114 organisations across 21 European countries). With political and financial support from 14 European governments and a robust governance model, DiSSCo will deliver, by 2025, a series of innovative end-user discovery, access, interpretation and analysis services for natural science collections data. As part of DiSSCo's developing data model, we evaluate the application of Digital Objects (DOs), which can act as the centrepiece of its architecture. DOs have bit-sequences representing some content, are identified by globally unique persistent identifiers (PIDs) and are associated with different types of metadata. The PIDs can be used to refer to different types of information, such as locations, checksums, types and other metadata, to enable immediate operations. In the world of natural science collections, currently fragmented data classes (inter alia genes, traits, occurrences) derived from the study of physical specimens can be re-united as parts of a virtual container, i.e., as components of a Digital Object. These typed DOs, when combined with software agents that scan the data offered by repositories, can act as complete digital surrogates of the physical specimens. In this paper we: (1) investigate the architectural and technological applicability of DOs for large-scale data RIs for bio- and geo-diversity; (2) identify benefits and challenges of a DO approach for the DiSSCo RI; and (3) describe key specifications (incl. metadata profiles) for a new specimen-based DO type.
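    To make the Digital Object idea above concrete, here is a minimal Python sketch of a typed DO acting as a virtual container for specimen-derived data classes. All class names, fields and identifier values are illustrative assumptions, not DiSSCo's actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DigitalObject:
    """A Digital Object: content bit-sequences identified by a globally
    unique persistent identifier (PID) and carrying typed metadata."""
    pid: str                   # e.g. a Handle; globally unique and resolvable
    do_type: str               # registered type enabling immediate operations
    checksum: str              # integrity information referenced via the PID
    locations: List[str]       # where the bit-sequences can be retrieved
    metadata: Dict[str, str] = field(default_factory=dict)

@dataclass
class DigitalSpecimen(DigitalObject):
    """A typed DO acting as a virtual container that re-unites data classes
    (genes, traits, occurrences, ...) derived from one physical specimen."""
    physical_specimen_id: str = ""                        # link to the physical object
    parts: Dict[str, str] = field(default_factory=dict)   # data class -> PID of component DO

# A specimen whose sequence and trait records live in other repositories:
ds = DigitalSpecimen(
    pid="20.5000.1025/abc-123",            # hypothetical Handle
    do_type="DigitalSpecimen",
    checksum="sha256:placeholder",
    locations=["https://repo.example.org/objects/abc-123"],
    physical_specimen_id="NHM-X-12345",    # hypothetical catalogue number
    parts={"genes": "20.5000.1025/seq-77", "traits": "20.5000.1025/trait-9"},
)
print(ds.parts["genes"])  # components stay independently resolvable by PID
```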

    FAIR data and services in biodiversity science and geoscience

    We examine the intersection of the FAIR principles (Findable, Accessible, Interoperable and Reusable), the challenges and opportunities presented by the aggregation of widely distributed and heterogeneous data about biological and geological specimens, and the use of the Digital Object Architecture (DOA) data model and components as an approach to solving those challenges that offers adherence to the FAIR principles as an integral characteristic. This approach will be prototyped in the Distributed System of Scientific Collections (DiSSCo), the pan-European Research Infrastructure that aims to unify over 110 natural science collections across 21 countries. We take each of the FAIR principles, discuss them as requirements in the creation of a seamless virtual collection of bio/geo specimen data, and map those requirements to Digital Object components and facilities such as persistent identification, extended data typing, and the use of an additional level of abstraction to normalize existing heterogeneous data structures. The FAIR principles inform and motivate the work, and the DO Architecture provides the technical vision to create the seamless virtual collection vitally needed to address scientific questions of societal importance.
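    The "additional level of abstraction" mentioned above can be illustrated with a short sketch: two heterogeneous source records describing the same specimen are mapped onto one normalized view. The records and field names are invented for illustration and do not come from the paper.

```python
from typing import Any, Dict

# Two heterogeneous source records for the same specimen, as two different
# collection management systems might expose them (invented field names).
record_a = {"catalogNumber": "B-0154837", "sciName": "Quercus robur"}
record_b = {"objectId": "0154837", "taxon": {"scientificName": "Quercus robur"}}

def normalize(record: Dict[str, Any]) -> Dict[str, str]:
    """One extra level of abstraction: map each heterogeneous source schema
    onto a single normalized view, so clients of the virtual collection
    always see the same structure."""
    name = record.get("sciName") or record.get("taxon", {}).get("scientificName", "")
    ident = record.get("catalogNumber") or record.get("objectId", "")
    return {"identifier": ident, "scientificName": name}

print(normalize(record_a))  # {'identifier': 'B-0154837', 'scientificName': 'Quercus robur'}
print(normalize(record_b))  # {'identifier': '0154837', 'scientificName': 'Quercus robur'}
```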

    Conceptual design blueprint for the DiSSCo digitization infrastructure - DELIVERABLE D8.1

    DiSSCo, the Distributed System of Scientific Collections, is a pan-European Research Infrastructure (RI) mobilising and unifying bio- and geo-diversity information connected to the specimens held in natural science collections and delivering it to scientific communities and beyond. Bringing together 120 institutions across 21 countries, and combining earlier investments in data interoperability practices with technological advancements in digitisation, cloud services and semantic linking, DiSSCo makes the data from natural science collections available as one virtual data cloud, connected with data emerging from new techniques and not already linked to specimens. These new data include DNA barcodes, whole genome sequences, proteomics and metabolomics data, chemical data, trait data and imaging data (Computer-assisted Tomography (CT), Synchrotron, etc.), to name but a few, and will lead to a wide range of end-user services, beginning with finding, accessing, using and improving data. DiSSCo will deliver the diagnostic information required for novel approaches and new services that will transform the landscape of what is possible in ways that are hard to imagine today.

With approximately 1.5 billion objects to be digitised, bringing natural science collections to the information age is expected to result in many tens of petabytes of new data over the coming decades, used on average by 5,000 to 15,000 unique users every day. This requires new skills, clear policies, robust procedures and new technologies to create, work with and manage large digital datasets over their entire research data lifecycle, including long-term storage, preservation and open access. Such processes and procedures must match, and be derived from, the latest thinking in open science and data management, realising the core principles of 'findable, accessible, interoperable and reusable' (FAIR).

Synthesised from the results of the ICEDIG project ('Innovation and Consolidation for Large Scale Digitisation of Natural Heritage', EU Horizon 2020 grant agreement No. 777483), the DiSSCo Conceptual Design Blueprint covers the organisational arrangements, processes and practices, the architecture, tools and technologies, culture, skills and capacity building, and governance and business model proposals for constructing the digitisation infrastructure of DiSSCo. In this context, the digitisation infrastructure of DiSSCo must be interpreted as the infrastructure (machinery, processing, procedures, personnel, organisation) offering Europe-wide capabilities for mass digitisation and digitisation-on-demand, and for the subsequent management (i.e., curation, publication, processing) and use of the resulting data. The blueprint constitutes the essential background needed to continue raising the overall maturity of the DiSSCo Programme across multiple dimensions (organisational, technical, scientific, data, financial) to achieve readiness to begin construction. Today, collection digitisation efforts have reached most collection-holding institutions across Europe. Much of the leadership, and many of the people involved in digitisation and working with digital collections, wish to take steps forward and expand these efforts to benefit further from the already noticeable positive effects.
The collective results of examining technical, financial, policy and governance aspects show the way forward to operating a large distributed initiative, i.e., the Distributed System of Scientific Collections (DiSSCo), for natural science collections across Europe. Ample examples of, opportunities for, and need for innovation and consolidation in large-scale digitisation of natural heritage have been described. The blueprint makes one hundred and four (104) recommendations to be considered by other elements of the DiSSCo Programme of linked projects (i.e., SYNTHESYS+, COST MOBILISE, DiSSCo Prepare, and others to follow) and by the DiSSCo Programme leadership as the journey towards organisational, technical, scientific, data and financial readiness continues. Nevertheless, significant obstacles must be overcome as a matter of priority if DiSSCo is to move beyond its Design and Preparatory Phases during 2024. Specifically, these include:

Organisational: Strengthen common purpose by adopting a common framework for policy harmonisation and capacity enhancement across broad areas, especially in respect of digitisation strategy and prioritisation, digitisation processes and techniques, data and digital media publication and open access, protection of and access to sensitive data, and administration of access and benefit sharing. Pursue the joint ventures and other relationships necessary to the successful delivery of the DiSSCo mission, especially ventures with GBIF and other international and regional digitisation and data aggregation organisations, in the context of infrastructure policy frameworks such as EOSC. Proceed with the explicit aim of avoiding divergences of approach in global natural science collections data management and research.

Technical: Adopt and enhance the DiSSCo Digital Specimen Architecture and, specifically as a matter of urgency, establish the persistent identifier scheme to be used by DiSSCo and (ideally) other comparable regional initiatives. Establish the (software) engineering development and (infrastructure) operations teams and direction essential to the delivery of the services and functionalities expected from DiSSCo, such that earnest engineering can lead to an early start of DiSSCo operations.

Scientific: Establish a common digital research agenda leveraging Digital (extended) Specimens as anchoring points for all specimen-associated and -derived information, demonstrating to research institutions and policy/decision-makers the new possibilities, opportunities and value of participating in the DiSSCo research infrastructure.

Data: Adopt the FAIR Digital Object Framework and the International Image Interoperability Framework (IIIF) as the low-entropy means of achieving uniform access to rich data (image and non-image) that is findable, accessible, interoperable and reusable (FAIR); a sketch of such a uniform image request follows after this list. Develop and promote best-practice approaches towards achieving the best digitisation results in terms of quality (best, according to agreed minimum information and other specifications), time (highest throughput, fast) and cost (lowest, minimal per specimen).

Financial: Broaden the attractiveness (i.e., improve the bankability) of DiSSCo as an infrastructure to invest in. Plan ways to bridge the funding gap and avoid disruptions in the critical funding path that risk interrupting core operations, especially where the gap opens between the end of preparations and the beginning of implementation due to unsolved political difficulties.
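The Data recommendation above names IIIF as the route to uniform access to images. As an illustration of why that helps, the sketch below builds IIIF Image API request URLs from the standard template ({identifier}/{region}/{size}/{rotation}/{quality}.{format}); the server and image identifier are hypothetical.

```python
# Build IIIF Image API 3.0 request URLs. Because every IIIF server answers
# the same URL template, a client needs no server-specific logic: this is
# the "uniform access" the blueprint recommends.
def iiif_image_url(server: str, identifier: str,
                   region: str = "full", size: str = "max",
                   rotation: int = 0, quality: str = "default",
                   fmt: str = "jpg") -> str:
    """{server}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}"""
    return f"{server}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

# Full-resolution image of a (hypothetical) digitised herbarium sheet:
print(iiif_image_url("https://iiif.example.org/image/v3", "specimen-0154837"))
# A 512-pixel-wide preview of its top-left quadrant:
print(iiif_image_url("https://iiif.example.org/image/v3", "specimen-0154837",
                     region="pct:0,0,50,50", size="512,"))
```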
Strategically, it is vital to balance the multiple factors addressed by the blueprint against one another to achieve the desired goals of the DiSSCo Programme. Decisions cannot be taken on one aspect alone without considering the others, and here the various governance structures of DiSSCo (General Assembly, advisory boards, and stakeholder forums) play a critical role over the coming years.

    The Bari Manifesto: An interoperability framework for essential biodiversity variables

    Essential Biodiversity Variables (EBVs) are fundamental variables that can be used for assessing biodiversity change over time, determining adherence to biodiversity policy, monitoring progress towards sustainable development goals, and tracking biodiversity responses to disturbances and management interventions. Data from observations or models that provide measured or estimated EBV values, which we refer to as EBV data products, can help to capture these processes and trends and can serve as a coherent framework for documenting trends in biodiversity. Using primary biodiversity records and other raw data as sources to produce EBV data products depends on cooperation and interoperability among multiple stakeholders, including those collecting and mobilising data for EBVs and those producing, publishing and preserving EBV data products. Here, we encapsulate ten principles for current best practice in EBV-focused biodiversity informatics as 'The Bari Manifesto', serving as implementation guidelines for data and research infrastructure providers to support the emerging EBV operational framework based on trans-national and cross-infrastructure scientific workflows. The principles provide guidance on how to contribute towards the production of EBV data products that are globally oriented, while remaining appropriate to the producer's own mission, vision and goals. The ten principles cover: data management planning; data structure; metadata; services; data quality; workflows; provenance; ontologies/vocabularies; data preservation; and accessibility. For each principle, desired outcomes and goals have been formulated. Some specific actions related to fulfilling the Bari Manifesto principles are highlighted in the context of each of four groups of organisations contributing to enabling data interoperability: data standards bodies, research data infrastructures, the pertinent research communities, and funders. The Bari Manifesto provides a roadmap enabling support for routine generation of EBV data products, and increases the likelihood of success of a global EBV framework.
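    As an illustration of how several of the ten principles could surface in practice, the sketch below shows a minimal metadata stub for a hypothetical EBV data product. All field names and values are assumptions made for illustration, not a schema defined by the manifesto.

```python
# Invented metadata stub for an EBV data product; each section gestures at
# one of the Bari principles (metadata, vocabularies, provenance, workflows,
# quality, preservation, services/accessibility).
ebv_data_product = {
    "title": "Example: tree cover change 2000-2020",
    "metadata": {"standard": "e.g. EML or ISO 19115"},                # metadata
    "vocabularies": ["Darwin Core terms where applicable"],           # ontologies/vocabularies
    "provenance": {
        "source_datasets": ["doi:10.9999/example"],                   # PID-referenced raw inputs
        "workflow": "documented, re-runnable pipeline",               # workflows
    },
    "quality": {"checks": ["completeness", "taxonomic validation"]},  # data quality
    "preservation": {"repository": "certified long-term archive"},    # data preservation
    "access": {"licence": "CC BY 4.0", "protocol": "standard web service"},  # services/accessibility
}

print(sorted(ebv_data_product))  # the principle-aligned sections of the record
```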

    The role of natural science collections in the biomonitoring of environmental contaminants in apex predators in support of the EU's zero pollution ambition

    The chemical industry is the leading sector in the EU in terms of added value. However, contaminants pose a major threat, and significant costs, to the environment and human health. While EU legislation and international conventions aim to reduce this threat, regulators struggle to assess and manage chemical risks, given the vast number of substances involved and the lack of data on exposure and hazards. The European Green Deal sets a 'zero pollution ambition for a toxic-free environment' by 2050, and the EU Chemicals Strategy calls for increased monitoring of chemicals in the environment. Monitoring of contaminants in biota can, inter alia: provide regulators with early warning of bioaccumulation problems with chemicals of emerging concern; trigger risk assessment of persistent, bioaccumulative and toxic substances; enable risk assessment of chemical mixtures in biota; and enable assessment of the effectiveness of risk management measures and of chemicals regulations overall. A number of these purposes are to be addressed under the recently launched European Partnership for Risk Assessment of Chemicals (PARC). Apex predators are of particular value to biomonitoring. Securing sufficient data at the European scale implies large-scale, long-term monitoring and a steady supply of large numbers of fresh apex predator tissue samples from across Europe. Natural science collections are very well placed to supply these. Pan-European monitoring requires effective coordination among field organisations, collections and analytical laboratories for the flow of required specimens, the processing and storage of specimens and tissue samples, contaminant analyses delivering pan-European data sets, and the provision of specimen and population contextual data. Collections are well placed to coordinate this. The COST Action European Raptor Biomonitoring Facility provides a well-developed model showing how this can work, integrating a European Raptor Biomonitoring Scheme, Specimen Bank and Sampling Programme. Simultaneously, the EU-funded LIFE APEX project has demonstrated a range of regulatory applications using cutting-edge analytical techniques. PARC plans to make best use of such sampling and biomonitoring programmes. Collections are poised to play a critical role in supporting PARC objectives and thereby contribute to the delivery of the EU's zero-pollution ambition.

    FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units

    Data science faces the following major challenges: (1) developing scalable cross-disciplinary capabilities, (2) dealing with increasing data volumes and their inherent complexity, (3) building tools that help to build trust, (4) creating mechanisms to operate efficiently in the domain of scientific assertions, (5) turning data into actionable knowledge units and (6) promoting data interoperability. As a way to overcome these challenges, we further develop the proposals by early Internet pioneers for Digital Objects as encapsulations of data and metadata made accessible by persistent identifiers. In the past decade, this concept was revisited by various groups within the Research Data Alliance and placed in the context of the FAIR Guiding Principles for findable, accessible, interoperable and reusable data. The basic components of a FAIR Digital Object (FDO) as a self-contained, typed, machine-actionable data package are explained. A survey of use cases has indicated the growing interest of research communities in FDO solutions. We conclude that the FDO concept has the potential to act as the interoperable federative core of a hyperinfrastructure initiative such as the European Open Science Cloud (EOSC).
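    A minimal sketch of the FDO anatomy described above: a PID binds together a registered type, metadata and a reference to the data bit-sequences, and machine-actionability follows from dispatching on the type. The identifiers, type names and toy operation registry below are assumptions, not part of any official FDO specification.

```python
# Identifiers and type names are hypothetical (the 21.T11148 prefix is used
# by real data-type registries, but these suffixes are invented).
fdo = {
    "pid": "21.T11148/fdo-example",           # persistent identifier
    "type": "21.T11148/sequence-dataset",     # registered, machine-resolvable type
    "metadata": {"creator": "A. Researcher", "licence": "CC0"},
    "data_ref": "https://repo.example.org/bits/42",  # the bit-sequences
}

def actionable_operations(fdo_record: dict) -> list:
    """Machine-actionability in sketch form: operations are chosen by looking
    up the object's registered type, not by inspecting the data itself."""
    type_registry = {"21.T11148/sequence-dataset": ["align", "blast", "download"]}
    return type_registry.get(fdo_record["type"], ["download"])

print(actionable_operations(fdo))  # ['align', 'blast', 'download']
```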

    Making small data big : What can Scratchpads do for you?

    Eight case studies from the 550 communities using Scratchpads. Scratchpads is an open-source, free-to-use platform that enables you to work in a collaborative online environment. With a Scratchpad you can easily create your own website to structure, manage, link and publish your biodiversity data.

    ‘The Last Mile’: The registry behind the identifier [Conference Abstract]

    Preserved specimens in natural science collections have lifespans of many decades and often several hundred years. Specimens must be unambiguously identifiable and traceable in the face of changes in physical location, changes in the organisation of the collection to which they belong, and changes in classification. When digitising museum collections, a clear link must be maintained between the physical specimen itself and the information digitally representing that specimen in cyberspace. The idea of a Natural Science Identifier (NSId) as a neutral, unique, universal and stable long-term persistent identifier (PID) of a 'Digital Specimen' is central to museums' ambitions for widening access. An NSId allows easy identification and referencing of specific Digital Specimens, regardless of type, location, owner or user. It provides a digital doorway to physical specimens through which services for arranging loans and visits can be accessed, as well as opening the door to innovative services for manipulating specimens' information directly, for work reliant upon discovery of related third-party information, and for demanding 3D modelling and visualisation of specimens. Because the work takes place within e-Infrastructures/cyberspace, new possibilities are opened for analysing hundreds of thousands of specimens simultaneously, for example by exploiting large-scale cloud computing capacity and deep mining/machine learning. There are several established identifier mechanisms that could be used as a basis for the NSId, but some variant of Handles is most appropriate over the very long term because of its neutrality, resistance to change and sustainability. Adopted uses of the Handle system include the identification of journal articles and datasets in education and research (using Digital Object Identifiers); film and television programme assets in the entertainment sector; financial derivatives; and international shipping and construction. Aside from being stable and sustained over time, an essential requirement of a global PID mechanism is independence from the museums/institutions assigning identifiers. NSIds are opaque insofar as no information can, or should, be inferred solely by inspecting the identifier. Stakeholders change, collections move, and organisations evolve, merge or disappear. Even designations and descriptions of specimens and collections can change. Information should only be revealed when the identifier is resolved via a neutral index. One can debate the most appropriate instantiation of the Handle system, but such debate is beside the point. Relevance, ease of use and added value of the supporting 'NSId Registry' (NSIdR), the index of the different kinds of natural science objects and their relations, are the decisive factors. This can be seen from the example of the Entertainment Identifier Registry (EIDR), founded by the major motion picture studios to create a reliable way to identify and track film and TV content distribution. Focus on the object model, promotional branding and value perception in the target user segment are the critical factors for success. Providing such a registry, seamlessly coupled to the work practices and language of the professionals, addresses the 'last mile' challenge (Koureas et al. 2016).
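A short sketch of the resolution step described above: nothing is inferred from the opaque identifier itself; everything is revealed by resolving it against a neutral index. The example uses the public Handle.Net proxy REST API, which does exist; the NSId value, however, is hypothetical and would not actually resolve.

```python
import requests

def resolve(identifier: str) -> dict:
    """Return the typed value records registered under a Handle-style PID,
    using the Handle.Net proxy's public REST API."""
    resp = requests.get(f"https://hdl.handle.net/api/handles/{identifier}",
                        timeout=10)
    resp.raise_for_status()
    return resp.json()  # e.g. {'responseCode': 1, 'handle': ..., 'values': [...]}

# Hypothetical NSId; a real call needs a registered handle (e.g. a DOI's).
record = resolve("20.5000.1025/nsid-example")
for value in record.get("values", []):
    print(value["type"], "->", value["data"]["value"])
```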
From specimens, class characteristics, storage containers and collections, to specific identifications, images, names, literature references and more, the NSIdR's triple-hierarchy object model, rooted in the OBO Foundry's Biological Collections Ontology, is the key to persistently identifying, relating and indexing the entire range of collection objects of interest to scientists and others working in the bio and geo realms. The NSIdR 'knowledge graph', interoperable with other identifier schemes, supports novel first- and third-party value-added services such as arranging loans and visits, curation and annotation, and machine learning for relationship discovery and pattern exploration.
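As a closing illustration, the registry idea can be pictured as a small knowledge graph in which collection objects of different kinds are nodes linked by typed relations, and value-add services become graph traversals. The node names and relation types below are invented for the sketch, not the NSIdR's actual object model.

```python
from collections import defaultdict

# node -> [(relation, node), ...]; a toy stand-in for a registry knowledge graph
graph = defaultdict(list)

def relate(subject: str, relation: str, obj: str) -> None:
    graph[subject].append((relation, obj))

relate("nsid:specimen-1", "identified_as", "nsid:identification-7")
relate("nsid:specimen-1", "depicted_by", "nsid:image-3")
relate("nsid:specimen-1", "stored_in", "nsid:container-12")
relate("nsid:container-12", "part_of", "nsid:collection-5")

# A value-add service such as "find everything connected to this specimen"
# is then a simple traversal of the typed edges:
def neighbourhood(node: str):
    for relation, other in graph[node]:
        yield node, relation, other

for triple in neighbourhood("nsid:specimen-1"):
    print(*triple)
```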